Speaker Independent Phonetic Transcription of Fluent Speech for Large Vocabulary Speech Recognition

Authors

Stephen E. Levinson

M. Y. Liberman

Andrej Ljolje

L. G. Miller

Abstract

Speaker independent phonetic transcription of fluent speech is performed using an ergodic continuously variable duration hidden Markov model (CVDHMM) to represent the acoustic, phonetic and phonotactic structure of speech. An important property of the model is that each of its fifty-one states is uniquely identified with a single phonetic unit. Thus, for any spoken utterance, a phonetic transcription is obtained from a dynamic programming (DP) procedure for finding the state sequence of maximum likelihood. A model has been constructed based on 4020 sentences from the TIMIT database. When tested on 180 different sentences from this database, phonetic accuracy was observed to be 56% with 9% insertions. A speaker dependent version of the model was also constructed. The transcription algorithm was then combined with lexical access and parsing routines to form a complete recognition system. When tested on sentences from the DARPA resource management task spoken over the local switched telephone network, phonetic accuracy of 64% with 8% insertions and word accuracy of 87% with 3% insertions were measured. This system is presently operating in an on-line mode over the local switched telephone network in less than ten times real time on an Alliant FX-80.
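
The decoding idea in the abstract (one state per phonetic unit, transcription read off the maximum-likelihood state sequence) can be illustrated with a minimal sketch. It is a simplification, not the authors' method: it assumes an ordinary frame-synchronous HMM without the explicit duration densities of the CVDHMM, and the function names, argument layout, and toy labels are all hypothetical.

```python
def viterbi(log_init, log_trans, log_emit):
    """Return the maximum-likelihood state path for one utterance.

    Assumed (hypothetical) inputs, all in the log domain:
      log_init[s]     : log P(first state = s)
      log_trans[r][s] : log P(next state = s | current state = r)
      log_emit[t][s]  : log-likelihood of frame t under state s
    """
    n_states = len(log_init)
    # Best score of any path ending in each state at frame 0.
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t-1][s] = best predecessor of state s at frame t
    for t in range(1, len(log_emit)):
        prev = score
        score, ptr = [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda r: prev[r] + log_trans[r][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + log_emit[t][s])
        back.append(ptr)
    # Trace back from the best final state.
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def collapse(path, labels):
    """Merge each run of identical states into a single phonetic symbol."""
    out = []
    for s in path:
        if not out or out[-1] != labels[s]:
            out.append(labels[s])
    return out
```

Because each state stands for exactly one phonetic unit, collapsing runs of the decoded path yields the transcription directly. In a frame-synchronous model like this one, a phone repeated back to back would be merged into one symbol; an explicit-duration model such as the one described in the abstract does not need this collapsing step.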

Similar resources

The CSLU speaker recognition corpus

This paper describes the CSLU Speaker Recognition Corpus data collection. The corpus was motivated by a need for speech data from many speakers, under different environmental conditions, with each speaker providing data over a significant period of time. The corpus was designed to provide sufficient data to study phonetic variability within and across sessions, and to design and evaluate system...

A bi-lingual Mandarin/Taiwanese (Min-nan), large vocabulary, continuous speech recognition system based on the Tong-yong Phonetic Alphabet (TYPA)

In this paper, we describe the first Mandarin/Taiwanese (Min-nan) bi-lingual, continuous speech recognition system for large vocabulary or vocabulary-independent applications. A phonetic transcription system called Tong-yong Phonetic Alphabet (TYPA) is described and used to transcribe the bilingual Mandarin/Taiwanese lexicons. The Right-Context-Dependent (RCD) phonetic continuous-density Hidden ...

Large Vocabulary Continuous Speech Recognition

Large vocabulary, speaker-independent speech recognition systems capable of recognizing continuous speech based on hidden Markov models are today’s standard. This review introduces the fundamentals of speech and the underlying speech recognition problems. The three classical approaches, i.e., the acoustic-phonetic, the statistical (pattern) recognition and the artificial intelligence appro...

Robust Continuous Speech Recognition

The primary objective of this basic research program is to develop robust methods and models for speaker-independent acoustic recognition of spontaneously-produced, continuous speech. The work has focussed on developing accurate and detailed models of phonemes and their coarticulation for the purpose of large-vocabulary continuous speech recognition. Important goals of this work are to achieve ...

Modeling coarticulation in EMG-based continuous speech recognition

This paper discusses the use of surface electromyography for automatic speech recognition. Electromyographic signals captured at the facial muscles record the activity of the human articulatory apparatus and thus allow a speech signal to be traced back even if it is spoken silently. Since speech is captured before it gets airborne, the resulting signal is not masked by ambient noise. The resulting ...

Publication date: 1989
